Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

application: extract path prefix logic to middleware #2733

Merged
merged 5 commits into from
Oct 7, 2019

Conversation

wchargin
Copy link
Contributor

@wchargin wchargin commented Oct 6, 2019

Summary:
This way, we can reason about and test it in isolation as we add more
middleware to the application.

Note that when using --path_prefix, loading the main TensorBoard page
without a trailing slash on the URL has long been broken (#1117). This
commit does not fix that, but changes the exact failure mode: rather
than 404ing, we now load a broken TensorBoard (relative URLs don’t
resolve). A follow-up commit will fix this properly.

Test Plan:
Verify that TensorBoard still works fully, with --path_prefix and
without it.

wchargin-branch: path-prefix-middleware

Summary:
Currently, we apply the `_handle_errors` middleware as a decorator
directly on `__call__`. We plan to add middleware that is parameterized
on instance variables, so this pattern will no longer suffice. This
commit simply refactors the existing code to apply middleware at
initialization time.

Test Plan:
It suffices to run `bazel test //tensorboard/backend:application_test`.

wchargin-branch: app-middleware
wchargin-source: 77c4f793de8b64d88bd10212db5a0191995b29ca
Summary:
This way, we can reason about and test it in isolation as we add more
middleware to the application.

Note that when using `--path_prefix`, loading the main TensorBoard page
without a trailing slash on the URL has long been broken (#1117). This
commit does not fix that, but changes the exact failure mode: rather
than 404ing, we now load a broken TensorBoard (relative URLs don’t
resolve). A follow-up commit will fix this properly.

Test Plan:
Verify that TensorBoard still works fully, with `--path_prefix` and
without it.

wchargin-branch: path-prefix-middleware
wchargin-source: 3e07d2b681f12c4bab310bef2afc066590ae5cb7
wchargin added a commit that referenced this pull request Oct 6, 2019
Summary:
Specifying a path prefix without leading slash has always been
disallowed by the docs, but in TensorBoard 2.0.0 this just hits an
inscrutable internal error when starting the server:

```
ValueError: invalid literal for int() with base 10: '6006my_path_prefix'
```

As of #2733, this is at least a correctly reported error, but it still
has a stack trace. This commit makes it a normal, fast-failing usage
error.

The logic to strip trailing slashes was also duplicated, so `/foo` and
`/foo/` and `/foo//` were all the same, but differed from `/foo///`.
This commit removes one occurrence and makes the remaining one remove
all trailing slashes instead of just one.

Test Plan:
Run with `--path_prefix /foo`, and note that TensorBoard exits with a
single-line error. Run with `/foo`, `/foo/`, `/foo//`, and `/foo///`,
and note that all four now have the same behavior.

wchargin-branch: path-prefix-validation
wchargin-source: 514c3bc317a1b90561f29f378919424b4284cc66

with self.subTest("at root"):
response = server.get("/pfx")
self._assert_ok(response, path="", script="/pfx")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no action required; I simply dunno: how does our server respond when the path is "" vs. "/"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. The answer is that Werkzeug normalizes that out, so our
app (which uses Werkzeug for routing) only sees request.path == "/":

# main.py
import werkzeug

def bare_app(environ, start_response):
  start_response("200 OK", [("Content-Type", "text/plain")])
  return [repr(environ.get("PATH_INFO")).encode("utf-8")]

@werkzeug.Request.application
def werkzeug_app(request):
  return werkzeug.Response(repr(request.path), content_type="text/plain")

start_response = lambda status, headers: None
request_empty = {"REQUEST_METHOD": "GET", "PATH_INFO": ""}
request_root = {"REQUEST_METHOD": "GET", "PATH_INFO": "/"}

print("With base WSGI:")
print(list(bare_app(request_empty, start_response)))
print(list(bare_app(request_root, start_response)))

print("With Werkzeug:")
print(list(werkzeug_app(request_empty, start_response)))
print(list(werkzeug_app(request_root, start_response)))
$ python main.py
With base WSGI:
[b"''"]
[b"'/'"]
With Werkzeug:
[b"'/'"]
[b"'/'"]

wchargin-branch: path-prefix-middleware
wchargin-source: e74bb6f8dd0aef732604991f63d1803b0ce2a206

# Conflicts:
#	tensorboard/backend/application.py
wchargin-branch: path-prefix-middleware
wchargin-source: e74bb6f8dd0aef732604991f63d1803b0ce2a206
@wchargin wchargin changed the base branch from wchargin-app-middleware to master October 7, 2019 18:52
from tensorboard import errors


class PathPrefixMiddleware(object):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I had been thinking we could generalize this as taking a dict mapping prefixes to WSGI apps. This unblocks using this for the plugin routing as well as described in #2573.

Feel free to leave this PR as-is, but just FYI, I think it will make sense to fold this into a more generic middleware sooner or later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me.

path_prefix: A string path prefix to be stripped from incoming
requests. If empty, this middleware is a no-op. If non-empty,
the path prefix must start with a slash and not end with one
(e.g., "/tensorboard").
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO the semantics are more obvious if we require that the path prefix end in a slash, since then we don't necessarily even need to support the redirect, and there's no confusion about what the canonical URL will be.

WDYT? I think we could fairly consider that not to be a breaking change given that using the prefix without the trailing slash is broken today due to the lack of redirect.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m fine with making that change at the user level (i.e., a requirement
on the flag value). As an implementation detail, I find it useful to
represent path components like this with a leading slash and no trailing
slash, because then they concatenate properly. This is also what
Werkzeug does for the .script_root property (they don’t expose
SCRIPT_NAME correctly), and is the convention that application.py
uses for constants like DATA_PREFIX = "/data".

Given that this PR doesn’t actually make any user-facing changes modulo
existing undefined behavior, I’m comfortable deferring the tightening of
the flag domain. Let me know if you object.

wchargin-branch: path-prefix-middleware
wchargin-source: e74bb6f8dd0aef732604991f63d1803b0ce2a206
@wchargin wchargin merged commit 0981d8a into master Oct 7, 2019
@wchargin wchargin deleted the wchargin-path-prefix-middleware branch October 7, 2019 20:25
wchargin added a commit that referenced this pull request Oct 8, 2019
Summary:
Specifying a path prefix without leading slash has always been
disallowed by the docs, but in TensorBoard 2.0.0 this just hits an
inscrutable internal error when starting the server:

```
ValueError: invalid literal for int() with base 10: '6006my_path_prefix'
```

As of #2733, this is at least a correctly reported error, but it still
has a stack trace. This commit makes it a normal, fast-failing usage
error.

The logic to strip trailing slashes was also duplicated, so `/foo` and
`/foo/` and `/foo//` were all the same but differed from `/foo///`. This
commit removes one copy of the logic and makes the remaining copy strip
all trailing slashes instead of just one.

Test Plan:
Run with `--path_prefix /foo`, and note that TensorBoard exits with a
single-line error. Run with `/foo`, `/foo/`, `/foo//`, and `/foo///`,
and note that all four now have the same behavior.

wchargin-branch: path-prefix-validation
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants